draft-ietf-avt-rtp-02.txt
Internet Engineering Task Force Audio-Video Transport WG
INTERNET-DRAFT H. Schulzrinne/S. Casner
AT&T/ISI
July 30, 1993
Expires: 10/01/93
RTP: A Real-Time Transport Protocol
Status of this Memo
This document is an Internet Draft. Internet Drafts are working documents
of the Internet Engineering Task Force (IETF), its Areas, and its Working
Groups. Note that other groups may also distribute working documents as
Internet Drafts.
Internet Drafts are draft documents valid for a maximum of six months.
Internet Drafts may be updated, replaced, or obsoleted by other documents
at any time. It is not appropriate to use Internet Drafts as reference
material or to cite them other than as a ``working draft'' or ``work in
progress.''
Please check the I-D abstract listing contained in each Internet Draft
directory to learn the current status of this or any other Internet Draft.
Distribution of this document is unlimited.
Abstract
This draft describes a real-time transport protocol (RTP)
suitable for the network transport of real-time data, such as
audio, video or simulation data for both multicast and unicast
transport services. The data transport is enhanced by a
control protocol (RTCP) designed to provide minimal control and
identification functionality particularly in multicast networks.
RTP and RTCP are designed to be independent of the underlying
transport and network layers. The protocol supports the use of
RTP-level translators and bridges. Within multicast associations,
sites can direct control messages to individual sites.
This specification is a product of the Audio-Video Transport working group
within the Internet Engineering Task Force. Comments are solicited and
should be addressed to the working group's mailing list at rem-conf@es.net
and/or the authors.
INTERNET-DRAFT RTP July 30, 1993
Contents
1 Introduction
2 Protocol Conventions
3 Real-time Data Transfer Protocol -- RTP
  3.1 Definitions
  3.2 RTP Fixed Header Fields
  3.3 The RTP Options
  3.4 Reverse-Path Option
  3.5 Security Options
  3.6 The Use of the Security Options
4 Real Time Control Protocol --- RTCP
5 Security Considerations
6 RTP over network and transport protocols
  6.1 Defaults
  6.2 ST-II
A Implementation Notes
  A.1 Timestamp recovery
  A.2 Detecting the Beginning of a Synchronization Unit
  A.3 Demultiplexing and Locating the Synchronization Source
B Addresses of Authors
1 Introduction
This draft concisely specifies a real-time transport protocol. A discussion
of the design decisions can be found in the current version of the companion
Internet draft draft-ietf-avt-issues.txt. The transport protocol provides
end-to-end delivery services for one or more streams of data with real-time
characteristics, for example, interactive audio and video. It does not
guarantee delivery or prevent out-of-order delivery, nor does it assume that
the underlying network is reliable and delivers packets in sequence. [The
sequence numbers included in RTP allow the end system to reconstruct the
sender's packet sequence, but sequence numbers may also be used to determine
the proper location of a packet, for example in video decoding, without
necessarily decoding packets in sequence]. RTP is designed to run on top
of a variety of network and transport protocols, for example, IP, ST-II
or UDP. [For most applications, RTP offers insufficient demultiplexing to
run directly on IP.] RTP transfers data in a single direction, possibly to
multiple destinations if supported by the underlying network. A mechanism
for indicating a return path for control data is provided.
While RTP is primarily designed to satisfy the needs of multi-participant
multimedia conferences, it is not limited to that particular application.
Storage of continuous data, interactive distributed simulation, active badges,
and control and measurement applications may also find RTP applicable.
Profiles are used to instantiate certain header fields and options for
particular sets of applications. A profile for audio and video data may be
found in the companion Internet draft draft-ietf-avt-profile.txt.
This document defines two packet formats and protocols:
o the real-time transport protocol (RTP) for exchanging data with
real-time properties.
o the real-time control protocol (RTCP) for conveying information about
the sites in an on-going association. RTCP options may be ignored
without affecting the ability to correctly receive data. RTCP is used
for loosely controlled conferences, i.e., where there is no explicit
admission control and set-up. Its functionality may be subsumed by
a conference control protocol (which is beyond the scope of this
document).
2 Protocol Conventions
Control fields (options) for RTP and RTCP share the same structure and
numbering space and are carried within the same packet. Options may appear
in any order, unless specifically restricted by the option description.
[The position of some security options may have significance.] Each option
consists of the final bit, the option type designation, a one-octet length
field denoting the total number of 32-bit long words comprising the option
(including final bit, type and length), and finally any option-specific
data. The last option before the packet data portion (``payload'') has the
'F' (final) bit set to one; for all other options this field has a value of
zero.
Fields within the fixed header and within options are aligned to the natural
length of the field, i.e., 16-bit words are aligned on even addresses,
32-bit long words are aligned at addresses divisible by four, etc. Octets
designated as padding have the value zero. Options unknown to the RTP
implementation or the application are to be ignored. Options with option
types having values from 64 to 127 inclusive are to be used for private
extensions. Fields designated as 'reserved' or 'R' are set aside for future
use; they should be set to zero by senders and ignored by receivers.
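The option layout described above can be walked with a short loop. The following Python sketch is illustrative only and not part of the protocol definition. It assumes that the final bit is the most significant bit of the first octet and that the option type occupies the remaining seven bits, which is what the option diagrams in Section 3.3 suggest. The names parse_options and is_private are hypothetical.

```python
def parse_options(data: bytes):
    """Split a run of RTP/RTCP options into (type, body) pairs.

    Returns the list of options and the offset at which the payload begins.
    """
    options = []
    offset = 0
    while True:
        if offset + 2 > len(data):
            raise ValueError("truncated option header")
        final = data[offset] >> 7          # 'F' bit (assumed to be the top bit)
        opt_type = data[offset] & 0x7F     # 7-bit option type (assumption)
        length_words = data[offset + 1]    # length in 32-bit words, header included
        if length_words == 0:
            raise ValueError("option length must be at least one word")
        end = offset + 4 * length_words
        if end > len(data):
            raise ValueError("option runs past the end of the packet")
        # The body is everything after the two header octets, padding included.
        options.append((opt_type, data[offset + 2:end]))
        offset = end
        if final:
            return options, offset


def is_private(opt_type: int) -> bool:
    """Option types 64 to 127 inclusive are set aside for private extensions."""
    return 64 <= opt_type <= 127
```

A receiver would typically skip any option whose type it does not recognize and continue at the returned offset, since unknown options are to be ignored.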
All integer fields are carried in network byte order, that is, most
significant byte (octet) first. The transmission order is described in
detail in [1], Appendix A. Unless otherwise noted, numeric constants are in
decimal (base 10). Numeric constants prefixed by '0x' are in hexadecimal.
Textual information is encoded according to the UTF-2 encoding of the ISO
standard 10646 (Annex F) [2,3]. US-ASCII is a subset of this encoding and
requires no additional encoding. The presence of multi-byte encodings is
indicated by setting the most significant bit to a value of one. A byte
with a binary value of zero may be used as a string terminator for padding
purposes.
[Text in square brackets is intended to motivate the design decisions made.]
3 Real-time Data Transfer Protocol -- RTP
3.1 Definitions
Payload is the data following the RTP fixed header and the RTP/RTCP options.
The payload format and interpretation are beyond the scope of this memo. A
valid RTP packet may carry no payload.
An RPDU is an RTP protocol data unit. It consists of the encapsulation
specific to a particular underlying protocol, the fixed RTP header, RTP and
RTCP options (if any) and the payload, if any.
A synchronization source is the combination of one or more content sources
with its own timing. The RPDUs emitted by a synchronization source
have non-decreasing sequence numbers and time stamps (modulo their field
lengths). The audio coming from a microphone or the video from a source
are examples of synchronization sources. Typically, a single source emits a
single medium (e.g., audio or video). A synchronization source is a member
of exactly one channel, as defined below. A synchronization source may
change its data format over time. Synchronization sources are identified
by their source network address, source transport address (e.g., UDP source
port) and the value of SSRC identifier carried in the SSRC option. If the
SSRC option is not present, a value of zero for that identifier is assumed.
A content source is the actual source of the data carried, for example, the
user and host that originally generated the audio data. One or more content
sources may contribute data for one synchronization source. Content sources
are used for identifying the logical source of the data; they have no effect
on the delivery of the data itself.
A network source is the network-level origin of the RPDUs as seen by the
receiving end system.
All sources sending to the same destination network address and
transport-level address using the same RTP flow identifier belong to the same
channel.
An end system generates the content to be used in RTP packets and delivers
the content of received RTP packets to the user application. An end system
can act as one or more synchronization sources. (Most end systems are
expected to be a single synchronization source.)
An (RTP-level) bridge receives RTP packets from one or more sources,
combines them in some manner and then forwards a new RTP packet. A bridge
may change the data format. Since the timing among multiple input sources
will not generally be synchronized, the bridge will make timing adjustments
among the streams and generate its own timing for the combined stream.
Therefore, bridges are synchronization sources, with each of the sources
whose packets were combined into an outgoing RTP packet as the content
sources for that outgoing packet. Audio bridges and media converters are
examples of bridges. Example: assume SMITH@FOO and JONES@BAR are using a
bridge to translate their audio from one encoding to another. The bridge
mixes audio packets from Smith and Jones together and forwards the mixed
packets. If, say, Smith was talking, she is indicated as the content
source of the outgoing packet, allowing the receiver to properly display the
current speaker rather than just the bridge that mixed the audio. For
an end system receiving RTP packets from that bridge, the bridge is the
synchronization source and Smith the content source. The RTP-level bridges
described in this document are unrelated to the data link-layer bridges
found in local area networks. If there is a possibility for confusion, the
term 'RTP-level bridge' should be used. [The name 'bridge' follows common
telecommunication usage.]
An (RTP-level) translator does not alter the timing of packets. Examples of
its use include encoding conversion without mixing or retiming, conversion
from multicast to unicast, and application-level filters in firewalls.
A translator is neither a synchronization nor a content source. The
properties of bridges and translators are summarized in Table 1. Checkmarks
in parentheses designate possible, but unlikely actions.

property                  end sys.  bridge  translator
mix sources               --        x       --
change encoding           N/A       x       x
encrypt                   x         x       (x)
sign for authentication   x         x       --
touch content             x         x       (x)
insert CSRC               --        x       --
insert SSRC               x         x       x
insert SDST               x         x       --
insert SDES               x         x       --

Table 1: The properties of end systems, bridges and translators

A synchronization unit consists of one or more packets that, as a group,
share a common fixed delay between generation and playout of each part of
the group, or can only be scheduled as a whole. The delay may change at the
beginning of such a synchronization unit. The most common synchronization
units are talkspurts for voice and frames for video transmission.
Non-RTP mechanisms refers to other protocols and mechanisms that may be
needed to provide a usable service. In particular, for multimedia
conferences, a conference control application may distribute encryption
and authentication keys, negotiate the encryption algorithm to be used, and
determine the mapping from the RTP format field to the actual data format
used. For simple applications, electronic mail or a conference database may
also be used. The specification of the mechanism itself is outside the
scope of this memorandum.
3.2 RTP Fixed Header Fields
The RTP header has the following format:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Ver|  FlowID   |P|S|  format   |        sequence number        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      timestamp (seconds)      |     timestamp (fraction)      |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
| options ...                                                   |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
The fields in the first eight octets are present in every RTP packet and
have the following meaning:
protocol version: 2 bits
Defines the protocol version. The version number of the protocol
defined in this memo is one.
FlowID: 6 bits
The value of the field is the flow identifier, which forms part of
the tuple identifying a channel (see definition above). [The flow ID
field is convenient if several different channels are to receive the
same treatment by the underlying layers or if a profile allows for
the concatenation of several RPDUs on different channels into a single
protocol data unit of the underlying protocol layer.]
option present bit (P): 1 bit
This flag has a value of one if the fixed RTP header is followed by one
or more options and a value of zero otherwise.
end-of-synchronization-unit (S): 1 bit
This flag has a value of one in the last packet of a synchronization
unit, a value of zero otherwise. [As shown in Section A, the beginning
of a synchronization unit can be readily established from this flag.
If this flag were to signal the beginning of a synchronization unit,
the end of a synchronization unit could not be established in real
time.]
format: 6 bits
The 'format' field forms an index into a table defined through
the RTCP FMT option or non-RTP mechanisms (see Section 3.1). The
mapping establishes the format of the RTP payload and determines its
interpretation by higher layers. If no mapping has been defined in
this manner, a standard mapping is specified by the companion profile
document, RFC TBD. Also, default formats may be defined by the current
edition of the Assigned Numbers RFC.
sequence number: 16 bits
The sequence number counts RPDUs. The sequence number increments by
one for each packet sent. [The sequence number may be used by the
receiver to detect packet loss, to restore packet sequence and to
identify packets to the application.]
timestamp: 32 bits
The timestamp reflects the wallclock time when the RPDU was generated.
The timestamp consists of the middle 32 bits of a 64-bit NTP timestamp,
as defined in RFC 1305 [4]. Several consecutive packets may have equal
timestamps.
The timestamp of the first packet(s) within a synchronization unit
is expected to closely reflect the actual sampling instant, measured
by the local system clock. The local system clock should be
controlled by a time synchronization protocol such as NTP if such
a service is available. It is not expected that the local system
clock be referenced to obtain the timestamp for the beginning of
every synchronization unit, but the local clock should be referenced
frequently enough so that clock drift between synchronized system clock
and sampling clock can be compensated for gradually. Within one
synchronization unit, it may be appropriate to compute timestamps based
on the logical timing relationships between the packets. For audio
samples, for example, the nominal sampling interval may be used. If
the clock quality field of the CDES option does not indicate otherwise,
it is assumed that the timestamp at the beginning of a synchronization
unit is derived from a synchronized system clock. However, it is
allowable to operate without synchronized time on those systems where
it is not available, unless a profile or session protocol requires
otherwise.
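The fixed header fields described in this section can be packed and unpacked as in the following Python sketch. It is an illustration under stated assumptions rather than a reference implementation: bit 0 in the header diagram is taken to be the most significant bit of the first octet, and the function names pack_header, parse_header and ntp_middle_32 are hypothetical. The last function shows one reading of "the middle 32 bits of a 64-bit NTP timestamp", namely the low 16 bits of the seconds field followed by the high 16 bits of the fraction field.

```python
import struct


def ntp_middle_32(ntp_seconds: int, ntp_fraction: int) -> int:
    """Take the low 16 bits of NTP seconds and the high 16 bits of the fraction."""
    return ((ntp_seconds & 0xFFFF) << 16) | ((ntp_fraction >> 16) & 0xFFFF)


def pack_header(flow_id, option_present, end_of_sync_unit, fmt, seq, timestamp,
                version=1):
    """Build the eight-octet fixed header in network byte order."""
    byte0 = ((version & 0x3) << 6) | (flow_id & 0x3F)
    byte1 = (int(option_present) << 7) | (int(end_of_sync_unit) << 6) | (fmt & 0x3F)
    # The 16-bit sequence number wraps around modulo 65536.
    return struct.pack("!BBHI", byte0, byte1, seq & 0xFFFF, timestamp & 0xFFFFFFFF)


def parse_header(packet: bytes) -> dict:
    """Decode the first eight octets of an RTP packet."""
    if len(packet) < 8:
        raise ValueError("packet shorter than the fixed header")
    byte0, byte1, seq, timestamp = struct.unpack("!BBHI", packet[:8])
    return {
        "version": byte0 >> 6,
        "flow_id": byte0 & 0x3F,
        "option_present": bool(byte1 & 0x80),
        "end_of_sync_unit": bool(byte1 & 0x40),
        "format": byte1 & 0x3F,
        "sequence": seq,
        "timestamp": timestamp,
    }
```

With a 16-bit seconds part the timestamp wraps roughly every 18 hours, so a receiver comparing timestamps would need to do so modulo the field length.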
3.3 The RTP Options
The packet header may be followed by options and the payload. Options are
summarized below. Unless otherwise noted, each option may appear only once
per packet. Each packet may contain any number of options. A conforming
implementation of RTP has to support the RTP options listed here, unless
otherwise noted.
CSRC 0 Content source identifiers. The content source option is inserted
only by bridges and identifies all sources that contributed to the
packet. For example, for audio packets, all sources that were mixed
together to create this packet are listed, allowing correct talker
indication at the receiver. Each CSRC option may contain one or
more content source identifiers, each 16 bits long. The identifier
values must be unique for all content sources received through a
particular synchronization source (bridge) on a particular channel;
the value of binary zero is reserved and may not be used. If the
number of content sources is even, the two octets needed to pad the
list to a multiple of four octets are set to zero. There should
only be a single CSRC option within a packet. If no CSRC option is
present, the content source identifier is assumed to have a value of
zero. CSRC options are not modified by RTP-level translators.
A conformant RTP implementation does not have to be able to generate
or interpret the CSRC option.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|    CSRC     |    length     | content source identifier ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
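The padding rule for the CSRC option follows from the two-octet option header: an odd number of 16-bit identifiers fills whole 32-bit words, while an even number leaves two octets over. A minimal Python sketch, assuming the final bit is the top bit of the first octet and using the hypothetical name build_csrc_option:

```python
import struct

CSRC_TYPE = 0  # option type given in the text above


def build_csrc_option(identifiers, final=False) -> bytes:
    """Encode a CSRC option carrying one or more 16-bit identifiers."""
    if not identifiers:
        raise ValueError("a CSRC option carries at least one identifier")
    for ident in identifiers:
        if not 1 <= ident <= 0xFFFF:
            raise ValueError("identifiers are 16 bits long and zero is reserved")
    body = b"".join(struct.pack("!H", ident) for ident in identifiers)
    if len(identifiers) % 2 == 0:
        body += b"\x00\x00"              # pad to a multiple of four octets
    length_words = (2 + len(body)) // 4  # length counts the option header too
    if length_words > 255:
        raise ValueError("too many identifiers for a one-octet length field")
    first = (0x80 if final else 0x00) | CSRC_TYPE
    return bytes([first, length_words]) + body
```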
SSRC 1 Synchronization source identifier. The SSRC option may be
inserted by RTP-level translators, end systems and bridges. It
is typically used only by translators, but it may be used by an
end system application to distinguish several sources sent with the
same lower-layer source address. Each synchronization source with
the same lower-layer address (e.g., the same IP address and UDP
port) must have a distinct SSRC. Synchronization sources that are
distinguishable by their lower-layer address do not require the use
of SSRC options. The SSRC value zero is reserved and must not be
used. If no SSRC option is present, the network source is assumed
to indicate the synchronization source. There must be no more than
one SSRC identifier per packet; thus, a translator must remap the
SSRC identifier of an incoming packet into a new, locally unique
SSRC identifier. The SSRC option may be considered in functionality
as an extension of the source port number in protocols like UDP,
ST-II or TCP.
An RTP receiver must support the SSRC option. RTP senders only need
to support this option if they intend to send more than one source
to the same channel using the same source port.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|    SSRC     |  length = 1   |          identifier           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
BOP 3 (beginning of playout unit) 16-bit sequence number designating the
first packet within the current playout unit.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|     BOP     |  length = 1   |        sequence number        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3.4 Reverse-Path Option
With two-party (unicast) communications, relaying back control information
to the sender is easy. For multicast communications, control information
can be sent to all members of the group. It may, however, be desirable to
send a message to an individual member of a multicast group, for example
to request retransmission of a particular data frame or to request/send a
reception quality report. For this particular use, we introduce a mechanism
for sending so-called reverse RPDUs. The RPDU format of reverse RPDUs is
exactly the same as for regular messages and they can make use of all
the options defined in this memorandum. Reverse RPDUs travel through the
same translators as other RPDUs. The receiver distinguishes reverse RPDUs
by their arrival on a different transport selector (e.g., a different UDP
port), namely the same one which is used as a source transport selector
(e.g., UDP source port) for forward RPDUs. A receiver of reverse RPDUs
cannot rely on any sequence number ordering, as a sender may use the same
sequence number space while communicating through this reverse mechanism
with several receivers. The sequence number space of reverse RPDUs has to
be completely separate from that used for RPDUs sent to the multicast group.
If the same sequence number space were used, the members of the multicast
group not receiving reverse RPDUs would detect a gap in their received
sequence number space.
SDST 2 Synchronization destination identifier. The SDST option is only
inserted by RTP end systems and bridges if they want to send
unicast information to a particular site within the multicast group.
Packets containing an SDST option must not contain an SSRC option
and vice versa. The identifier value zero is allowed, unlike for
SSRC options (see example below).
Denote the end system that wants to return a unicast message by
S and the desired destination end system of that unicast message by
D. If the multicast packets received by S from D contain no SSRC
option, S and D must be directly connected, without an intervening
translator. No SDST option is needed in this case.
If the multicast packets received by S from D contain an SSRC option,
S inserts an SDST option using the identifier contained in the
SSRC option received from D. S then sends the packet to the
source network and transport address found in the multicast packets
coming from D. The packet will thus reach the translator on the
path between S and D closest to S. The arrival on that transport
address tells the translator that the packet is a unicast reverse
control packet. The translator determines which source it maps
into the identifier contained in the SDST option and replaces the
SDST identifier by that value. In other words: if a forward RTP
packet carries SSRC identifier X between two systems (either two
translators or an end system and a translator), the unicast reverse
control packet will carry SDST with identifier X between those two
systems.
Example for UDP: T1 and T2 are translators between end systems
S and D. In the forward direction, D sends regular RTP packets
with no SSRC to (among other multicast group members) translator T2
with destination port 3456 and source port 5678; T2 inserts SSRC
identifier 13 and forwards to translator T1 on source port 4590
and destination port 3456; T1 translates SSRC 13 into SSRC 8 and
forwards to S using destination port 3456 and source port 12789. In
the unicast reverse RPDU, site S sends the packet to translator T1,
destination port 12789 with SDST value 8. T1 replaces SDST value 8
with SDST value 13 and forwards to translator T2 with destination
port 4590. T2 finally sends the message with SDST value 0 to site D
at destination port 5678. By its arrival port, site D determines
that the RPDU is a reverse RPDU and treats it accordingly.
[Reverse control unicast packets are already identified by their
destination transport address, so SSRC could be used for reverse
control packets. A separate option is used to limit confusion.]
Only applications that need to send or receive unicast control
information need to implement the SDST option.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|    SDST     |  length = 1   |          identifier           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
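The UDP example in this section can be traced with a small model of the state a translator has to keep. The Python sketch below is only an illustration: the Translator class and its methods are hypothetical, host names stand in for network addresses, and a packet without an SSRC option is represented by the identifier zero, as stated in Section 3.1.

```python
class Translator:
    """Hypothetical RTP-level translator that remembers its SSRC remapping."""

    def __init__(self):
        # local SSRC -> (upstream host, upstream source port, upstream SSRC)
        self._upstream = {}

    def learn(self, host, port, upstream_ssrc, local_ssrc):
        """Record that packets from (host, port, upstream_ssrc) leave with local_ssrc."""
        if local_ssrc == 0:
            raise ValueError("SSRC value zero is reserved")
        self._upstream[local_ssrc] = (host, port, upstream_ssrc)

    def reverse(self, sdst):
        """Map an incoming SDST to the next hop, its port and the outgoing SDST."""
        host, port, upstream_ssrc = self._upstream[sdst]
        return host, port, upstream_ssrc


# Forward direction of the example: D -> T2 -> T1 -> S.
t2 = Translator()
t2.learn("D", 5678, 0, 13)     # D sends no SSRC (treated as 0); T2 inserts 13
t1 = Translator()
t1.learn("T2", 4590, 13, 8)    # T1 translates SSRC 13 into SSRC 8

# Reverse direction: S addresses T1 with SDST 8.
hop1 = t1.reverse(8)           # next hop T2, port 4590, SDST 13
hop2 = t2.reverse(hop1[2])     # next hop D, port 5678, SDST 0
```

Each hop replaces the SDST identifier with the SSRC value that the forward packets carried on the same link, which is why the last hop delivers SDST value 0 to D.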
3.5 Security Options
The security options below offer message integrity, authentication and
privacy, and combinations of the three.
Support for the security options is not mandatory, but see the discussion
for the ENC option. The four message integrity check options --- MIC, MICA,
MICK and MICS --- are mutually exclusive, i.e., only one of them should be
used for a single RPDU.
All message integrity check options are computed over the fixed header,
the ENC option preceding the message integrity check option (if present),
the first four octets of the message integrity check option and the data
(remaining header and payload) following the message integrity check option.
The message integrity check options and the ENC option shall not cover
the SSRC and SDST options, i.e., SSRC and SDST must be inserted between
the fixed header and the ENC or message integrity check options, as SSRC
and SDST are subject to change by translators that are likely not in
possession of the necessary descriptor table (see below) and encryption
keys. Translators that have the necessary keys and descriptor translation
table may modify the contents of the RPDU, unless the MICA option is used
(see MICA description).
All security options carry a one-octet descriptor field. This descriptor
is an index into two tables, one for the message integrity check options,
one for the ENC option, established by non-RTP means, containing digest
algorithms (MD2, MD5, etc.), encryption algorithms (DES variants) and
encryption keys or shared secrets (for the MICK option). All sources within
the same channel share the same table. The descriptor value may change
during a session, for example, to use a different set of encryption keys.
The descriptor value zero describes a set of default algorithms to be used:
MD5 for the message digest algorithm, DES CBC for the encryption algorithm.
The MIC, MICK and MICS message integrity checks offer group authentication,
that is, the receiver can ascertain that the RPDU originated from a member
of the group of sites sharing a common secret, but the receiver cannot
authenticate which of the sources among that group sent the data. The
receiver can also be assured that nobody outside the group tampered with the
RPDU.
ENC 8 All packet data after this option, but not the fixed header,
is encrypted, using the encryption key and symmetric encryption
algorithm specified by the descriptor field. The descriptor value
may change over time to accommodate varying security requirements or
reduce the amount of ciphertext using the same key. [For example,
in a network interview, the candidate and interviewers could share
one key, with a second key set aside for the interviewers only. For
symmetric keys, source-specific keys offer no advantage.]
The descriptor value zero is reserved for a default mode using
the Data Encryption Standard (DES) algorithm in CBC (cipher block
chaining) mode, as described in Section 1.1 of RFC 1423 [5]. The
padding specified in that section is to be used. The 8-octet
initialization vector (IV) may be carried unencrypted within the ENC
option or generated anew for each packet. If the ENC option does
not contain an initialization vector (indicated by an option length
of 1), the fixed RTP header is used as the IV. [Using the fixed
RTP header as the IV avoids regenerating the IV for each packet
and incurs less header overhead.] For details on the tradeoffs
for CBC IV use, see [6]. Support for encryption is not required.
Implementations that do not support encryption should recognize the
ENC option so that they can avoid processing encrypted messages
and provide a meaningful failure indication. Implementations that
support encryption should, at the minimum, always support the DES
CBC algorithm.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|     ENC     |  length = 3   |   reserved    |  descriptor   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      DES (CBC) initialization vector, bytes 0 through 3       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      DES (CBC) initialization vector, bytes 4 through 7       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|     ENC     |  length = 1   |   reserved    |  descriptor   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
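The two ENC layouts above differ only in whether the initialization vector travels in the option. The Python sketch below shows how a receiver might select the IV for the default DES CBC mode; it performs no decryption. The name enc_descriptor_and_iv is hypothetical, and the sketch assumes the final bit is the top bit of the first option octet.

```python
ENC_TYPE = 8  # option type given in the text above


def enc_descriptor_and_iv(option: bytes, fixed_header: bytes):
    """Return (descriptor, iv) for an ENC option and the packet's fixed header."""
    if len(fixed_header) != 8:
        raise ValueError("the fixed RTP header is eight octets long")
    if len(option) < 4 or (option[0] & 0x7F) != ENC_TYPE:
        raise ValueError("not an ENC option")
    length_words = option[1]
    descriptor = option[3]          # option[2] is the reserved octet
    if length_words == 3:
        if len(option) < 12:
            raise ValueError("ENC option truncated before the end of the IV")
        iv = option[4:12]           # IV carried unencrypted in the option
    elif length_words == 1:
        iv = fixed_header           # no IV present: the fixed header serves as IV
    else:
        raise ValueError("unexpected ENC option length")
    return descriptor, iv
```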
MIC 9 Message integrity check. The MIC option is used only in
combination with the ENC option immediately preceding it to provide
privacy and group membership authentication. The message integrity
check uses the digest algorithm specified by the descriptor field.
The value zero implies the use of the MD5 message digest. Note that
the MIC option is not separately encrypted.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|     MIC     |    length     |   reserved    |  descriptor   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| message digest (unencrypted) ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
MICA 10 Message integrity check, asymmetric encryption. Currently, only
the use of the MD2 and MD5 message digest algorithms is defined, as
described in RFC 1319 [7] (as corrected in Section 2.1 of RFC 1423)
and RFC 1321 [8], respectively. The MD2 and MD5 message digests are
16 octets long.
``To avoid any potential ambiguity regarding the ordering of the
octets of an MD2 message digest that is input as a data value
to another encryption process (e.g., RSAEncryption), the following
holds true. The first (or left-most displayed, if one thinks in
terms of a digest's "print" representation) octet of the digest
(i.e., digest[0] as specified in RFC 1319), when considered as
an RSA data value, has numerical weight 2**120. The last (or
right-most displayed) octet (i.e., digest[15] as specified in RFC
1319) has numerical weight 2**0.'' [RFC 1423, Section 2.1]
``To avoid any potential ambiguity regarding the ordering of the
octets of a MD5 message digest that is input as an RSA data value to
the RSA encryption process, the following holds true. The first (or
left-most displayed, if one thinks in terms of a digest's "print"
representation) octet of the digest (i.e., the low-order octet of
A as specified in RFC 1321), when considered as an RSA data value,
has numerical weight 2**120. The last (or right-most displayed)
octet (i.e., the high-order octet of D as specified in RFC 1321) has
numerical weight 2**0.'' [RFC 1423, Section 2.2]
The message digest is encrypted, using asymmetric keys, with the
sender's private key using the algorithm described in Section 4.2.1
of RFC 1423: ``As described in PKCS #1, all quantities input as
data values to the RSAEncryption process shall be properly justified
and padded to the length of the modulus prior to the encryption
process. In general, an RSAEncryption input value is formed by
concatenating a leading NULL octet, a block type BT, a padding
string PS, a NULL octet, and the data quantity D, that is, RSA input
value = 0x00, BT, PS, 0x00, D. To prepare a MIC for RSAEncryption,
the PKCS #1 ``block type 01'' encryption-block formatting scheme
is employed. The block type BT is a single octet containing the
value 0x01 and the padding string PS is one or more octets (enough
octets to make the length of the complete RSA input value equal
to the length of the modulus) each containing the value 0xFF. The
data quantity D is comprised of the MIC and the MIC algorithm
identifier.'' The encoding is described in detail in RFC 1423.
For encrypting MD2 and MD5, the data quantity D is comprised of the
16-byte checksum, preceded by the binary sequences shown here in
hexadecimal: 0x30, 0x20, 0x30, 0x0C, 0x06, 0x08, 0x2A, 0x86, 0x48,
0x86, 0xF7, 0x0D, 0x02, 0x02, 0x05, 0x00, 0x04, 0x10 for MD2 and
0x30, 0x20, 0x30, 0x0C, 0x06, 0x08, 0x2A, 0x86, 0x48, 0x86, 0xF7,
0x0D, 0x02, 0x05, 0x05, 0x00, 0x04, 0x10 for MD5.
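The block formatting quoted above can be illustrated by the following
sketch. It only assembles the RSA input value (0x00, BT, PS, 0x00, D)
for a modulus of k octets; the RSA operation itself is not shown. The
function name mica_format_block and its parameters are illustrative
assumptions and are not part of this specification or of RFC 1423.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Algorithm identifier prefixes quoted in the text (18 octets each). */
const unsigned char MD2_PREFIX[18] = {
    0x30, 0x20, 0x30, 0x0C, 0x06, 0x08, 0x2A, 0x86, 0x48,
    0x86, 0xF7, 0x0D, 0x02, 0x02, 0x05, 0x00, 0x04, 0x10
};
const unsigned char MD5_PREFIX[18] = {
    0x30, 0x20, 0x30, 0x0C, 0x06, 0x08, 0x2A, 0x86, 0x48,
    0x86, 0xF7, 0x0D, 0x02, 0x05, 0x05, 0x00, 0x04, 0x10
};

#define DIGEST_LEN 16
#define D_LEN (sizeof MD5_PREFIX + DIGEST_LEN)   /* 34 octets */

/*
 * Fill out[0..k-1] with 0x00, 0x01, PS, 0x00, D, where k is the length
 * of the modulus in octets.  Returns 0 on success, or -1 if the modulus
 * is too short to hold at least one padding octet.
 */
int mica_format_block(unsigned char *out, size_t k,
                      const unsigned char prefix[18],
                      const unsigned char digest[DIGEST_LEN])
{
    size_t ps_len;

    assert(out != NULL && prefix != NULL && digest != NULL);
    if (k < 3 + 1 + D_LEN)
        return -1;
    ps_len = k - 3 - D_LEN;

    out[0] = 0x00;                          /* leading NULL octet */
    out[1] = 0x01;                          /* block type BT */
    memset(out + 2, 0xFF, ps_len);          /* padding string PS */
    out[2 + ps_len] = 0x00;                 /* NULL octet before D */
    memcpy(out + 3 + ps_len, prefix, 18);   /* algorithm identifier */
    memcpy(out + 3 + ps_len + 18, digest, DIGEST_LEN);
    return 0;
}
```

For a 512-bit modulus (k = 64), PS is 27 octets long and the digest
occupies the last 16 octets of the block.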
The signature is padded as necessary. The value of the padding is
left unspecified. [Note: The number of non-padding bits within the
signature is known to the receiver as being equal to the key length.
The MIC algorithm is identified through the bytes prepended to the
actual 16-byte signature.]
Contrary to what is specified in RFC 1423 for privacy enhanced mail,
the asymmetrically signed MIC is carried in binary, NOT represented
in the printable encoding of RFC 1421, Section 4.3.2.4. The length
of the encrypted signature equals the length of the modulus of the RSA
key used, rounded up to the next whole number of octets.
The modulus and public key are conveyed to the receivers by non-RTP
means. [Note: Asymmetric keys are used since symmetric keys would
not allow authentication of the individual source in the multicast
case.]
A translator that receives an RPDU is not allowed to modify the
parts of the RPDU covered by the MICA option as the receiver would
have no way of establishing the identity of the translator and thus
could not verify the integrity of the RPDU.
Support for sending or interpreting MICA options is not required.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|    MICA     |    length     |  encrypted message-digest   ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
MICK 11 Message integrity check, keyed. This message integrity check
does not require encryption. In addition to the RPDU parts to be
included in the message digest according to the introduction to this
section, the shared secret is placed in the MICK option and included
in the message digest. (The shared secret is equivalent to the
key used for the MICS and ENC options, but is 16 octets long, padded
with binary zeroes if necessary.) The shared secret in the
MICK option is then replaced by the computed 128-bit digest.
The receiver saves the message digest contained in the MICK option,
replaces it with the shared secret key, and computes the message
digest in the same manner as the sender. If the RPDU has not been
tampered with and originated with one of the holders of the secret
key, the computed message digest will agree with the digest found on
reception in the MICK option.
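The sender and receiver steps described above can be sketched as
follows. The digest is supplied through a callback so that the sketch
stays independent of any particular MD5 implementation; toy_digest is
only a stand-in that lets the example run on its own and is neither
MD5 nor secure. All names (mick_sign, mick_verify, digest_fn, the
offset parameter) are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MICK_DIGEST_LEN 16

/* Hypothetical digest callback: 16 octets computed over data[0..len-1]. */
typedef void (*digest_fn)(const unsigned char *data, size_t len,
                          unsigned char out[MICK_DIGEST_LEN]);

/* Placeholder only: NOT MD5 and not secure. */
void toy_digest(const unsigned char *data, size_t len,
                unsigned char out[MICK_DIGEST_LEN])
{
    size_t i;

    memset(out, 0, MICK_DIGEST_LEN);
    for (i = 0; i < len; i++) {
        unsigned char *o = &out[i % MICK_DIGEST_LEN];
        *o = (unsigned char)(*o * 31 + data[i] + i);
    }
}

/*
 * Sender: pkt[0..len-1] holds the parts covered by the digest; the
 * 16-octet digest field of the MICK option starts at offset off.
 */
void mick_sign(unsigned char *pkt, size_t len, size_t off,
               const unsigned char secret[MICK_DIGEST_LEN], digest_fn md)
{
    unsigned char d[MICK_DIGEST_LEN];

    assert(off + MICK_DIGEST_LEN <= len);
    memcpy(pkt + off, secret, MICK_DIGEST_LEN);  /* place shared secret */
    md(pkt, len, d);                             /* digest covers secret */
    memcpy(pkt + off, d, MICK_DIGEST_LEN);       /* secret -> digest */
}

/* Receiver: returns 1 if the digest agrees, 0 otherwise. */
int mick_verify(unsigned char *pkt, size_t len, size_t off,
                const unsigned char secret[MICK_DIGEST_LEN], digest_fn md)
{
    unsigned char received[MICK_DIGEST_LEN], computed[MICK_DIGEST_LEN];

    assert(off + MICK_DIGEST_LEN <= len);
    memcpy(received, pkt + off, MICK_DIGEST_LEN);  /* save digest */
    memcpy(pkt + off, secret, MICK_DIGEST_LEN);    /* put secret back */
    md(pkt, len, computed);
    memcpy(pkt + off, received, MICK_DIGEST_LEN);  /* restore packet */
    return memcmp(received, computed, MICK_DIGEST_LEN) == 0;
}
```

A real implementation would pass an MD5 routine as the callback and
would compare the digests in a way that does not leak timing
information.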
[The message integrity check follows the practice of SNMP Version 2,
as described in RFC 1446, Section 1.5.1. The MICK option itself
is covered by the digest in order to detect tampering with the
descriptor field itself. Using the secret key in the signature
instead of encrypting the MD5 message digest avoids the use of an
encryption algorithm when only authentication is desired. However,
the security of this approach has not been as well established as
that based on encrypting message digests, as used in the MICS, MIC
and MICA options.]
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|    MICK     |    length     |   reserved    |  descriptor   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        message digest                       ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
MICS 12 Message integrity check, symmetric-key encrypted. This message
integrity check encrypts the message digest using DES ECB mode as
described in RFC 1423, Section 3.1.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|    MICS     |    length     |   reserved    |  descriptor   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   encrypted message digest                  ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3.6 The Use of the Security Options
Combinations of the message integrity check and ENC security options can be
used to provide a variety of security services:
confidentiality: Confidentiality here means that only the intended
receiver(s) can decode the received RTP packets; for others, the data
contains no useful information. Confidentiality of the content is
achieved by encryption using DES. The presence of encryption and the
initialization vector is indicated by the ENC option. [Note: for
efficiency reasons, this specification does not insist that content
encryption only be used in connection with message integrity and
authentication mechanisms. In almost all cases, it will be obvious to
the person receiving the data if he or she does not possess the right
encryption key.]
authentication and message integrity: In combination with certificates, the
receiver can ascertain that the claimed originator is indeed the
originator of the data (authentication) and that the data has not
been altered after leaving the sender (message integrity). These two
security services are provided by the message integrity check options.
Certificates for MICA must be distributed through means outside of RTP.
The services offered by MICA and MIC/MICK/MICS differ:
With MIC/MICK/MICS, the receiver can only verify that the
message originated within the group holding the secret key, rather than
authenticate the sender of the message, while the MICA option affords
true authentication of the sender.
authentication, message integrity, and confidentiality: By carrying both
the message integrity check and ENC option in RTP packets, the
authenticity, message integrity and confidentiality of the packet can
be assured (subject to the limitations discussed in the previous
paragraph).
The message integrity check is applied first to all parts of the
outgoing packet to be authenticated, and the message integrity check
option is prepended to those parts. Then, the packet including the
message integrity check option is encrypted using the shared secret
key. The ENC option must be followed immediately by the message
integrity check option, without any other options in between. The
receiver first decrypts the octets following the ENC option and then
authenticates the decrypted data using the signature contained in the
message integrity check option.
For this combination of security features and group authentication, the
combination ENC and MIC is recommended (instead of MICS or MICK) as it
yields the lowest processing overhead.
A message integrity check option followed by an ENC option should not be
used.
4 Real Time Control Protocol --- RTCP
The real-time control protocol (RTCP) conveys minimal control and advisory
information during a conference. It provides support for loosely controlled
conferences, i.e., where participants enter and leave without admission
control and parameter negotiation. The services provided by RTCP
enhance RTP, but an end system does not have to implement RTCP features to
participate in conferences(1). RTCP does not aim to provide the services
of a conference control protocol and does not provide some of the services
desirable for two-party conversations. If a conference control protocol is
in use, the services of RTCP should not be required. (Note: as of the
writing of this document, a conference or session control protocol has not
been specified within the Internet.)
Unless otherwise noted, control information is carried periodically as
options within RPDUs, with or without payload. RTCP packets are sent to
all members of a conference. These packets are part of the same sequence
number space as RTP packets not containing RTCP options. The period should
be varied randomly to avoid synchronization of all sources and its mean
should increase with the number of participants in the conference to limit
the growth of the overall network and host interrupt load. The length of
the period determines, for example, how long a receiver joining a conference
has to wait in the worst case until it can identify the source. A receiver
may remove from its list of active sites a site that it has not heard from
for a given time-out period; the time-out period may depend on the number of
sites or the observed average interarrival time of RTCP messages. Note that
not every periodic message has to contain all RTCP options; for example, the
MAIL part within the SDES option might only be sent every few messages.
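One possible way to choose the transmission period is sketched below.
This specification does not prescribe a formula; the linear growth of
the mean with the number of participants, the base_period parameter
and the uniform variation between 0.5 and 1.5 times the mean are all
assumptions made for illustration only.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Returns the delay in seconds until the next control transmission.
 * base_period is a hypothetical per-participant period in seconds.
 */
double rtcp_next_interval(unsigned int participants, double base_period)
{
    double mean, u;

    assert(base_period > 0.0);
    if (participants < 1)
        participants = 1;
    mean = base_period * participants;   /* mean grows with group size */
    u = (double)rand() / ((double)RAND_MAX + 1.0);  /* uniform in [0,1) */
    return mean * (0.5 + u);             /* vary randomly around the mean */
}
```

The random variation keeps sources that started at the same time from
remaining synchronized.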
The item types are defined below:
FMT 32 Format description.
format: 6 bits
The 'format' field corresponds to the index value from the
'format' RTP fixed header field, with values ranging from 0 to
63.
Clock quality: 8 bits
Provides an indication as to the sender-perceived quality of
the timestamps in the RTP header. The octet is interpreted
as a quantity indicating the maximum dispersion to a root time
server measured in fractions of a second and expressed as a
power of two.
------------------------------
1. There is one exception to that rule: if an application sends FMT
options, the receiver has to decode these in order to properly interpret the
RTP payload.
If a source is known to be synchronized to standard time, but
with an unknown dispersion, or the dispersion is greater than
TBD, the value TBD is used. If the clock is based on the
nominal sample rate of the source, a value of TBD is used.
The clock quality indication can be used to judge how the delay
measurements reported by the QOS option can be interpreted (as
absolute delay or only as delay variation). It is also useful
for determining to what extent several sources with different
clocks can be synchronized.
Format-dependent data: variable
Format-dependent data may or may not appear in a FMT option.
It is passed to the next layer and not interpreted by RTP.
A FMT mapping changes the interpretation of a given 'format' value
(as carried in the fixed RTP header) starting at the packet
containing the FMT option. The new interpretation applies only
to packets from the synchronization source of this packet. A
sender should refrain from changing the mappings between the RTP
format field and the other fields in the FMT option that have been
established through a conference registry, a conference announcement
protocol or otherwise. Dynamic changes to these values may result
in misinterpretation of RTP payload if the packet(s) containing the
FMT option are lost.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|     FMT     |    length     |R|R|  format   | clock quality |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    format-dependent data                    ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
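A receiver might decode the fixed part of the FMT option as sketched
below. The sketch assumes that bit 0 in the figure is the most
significant bit of the first octet and that F is the final-option bit;
the structure and function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

struct fmt_option {
    unsigned int final;          /* F bit */
    unsigned int type;           /* option type, 32 for FMT */
    unsigned int length;         /* option length field */
    unsigned int format;         /* 6-bit format index, 0..63 */
    unsigned int clock_quality;  /* 8-bit clock quality */
};

/* Decode the first 32-bit word of a FMT option; returns 0 on success. */
int fmt_parse(const unsigned char *p, size_t avail, struct fmt_option *o)
{
    assert(p != NULL && o != NULL);
    if (avail < 4)
        return -1;
    o->final = (p[0] >> 7) & 0x01;
    o->type = p[0] & 0x7F;
    o->length = p[1];
    o->format = p[2] & 0x3F;     /* the two reserved (R) bits are ignored */
    o->clock_quality = p[3];
    return o->type == 32 ? 0 : -1;
}
```

Any format-dependent data following the first word is passed on
uninterpreted, as described above.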
SDES 33 This option provides a mapping between a numeric source identifier
and one or more identifying attributes. [Several attributes were
combined into one option to avoid multiple mappings from identifiers
to the receiver site data structure.] For those applications where
the size of a multipart SDES option would be a concern, multiple
SDES options may be formed with subsets of the parts to be sent in
separate packets.
An end system or a bridge uses an identifier value of zero to
identify itself. For each contributor, a bridge forwards the SDES
information received from that contributor, but changes the SDES
source identifier to correspond to the value used in the CSRC option
when identifying this contributor. A bridge that contributes data
to outgoing packets should use a CSRC and select another non-zero
source identifier for that traffic and send CSRC and SDES options
for it.
Translators do not modify or insert SDES options. The end system
performs the same mapping it uses to identify the content sources
(that is, the combination of network source, synchronization source
and the source number within this SDES option) to identify a
particular source. SDES information is specific to a particular
flow identifier, unless a higher-layer control protocol defines
that all packets with the same source identifier (network and
transport-level source addresses and the optional SSRC value) from a
set of channels defined by the control protocol are described by the
same SDES.
Currently, the following items are defined. Each has a structure
similar to that of RTCP and RTP options, that is, a type field
followed by a length field (measured in multiples of four octets).
No final bit is needed since the overall length is known. All
of the SDES items are optional; however, if quality-of-service
monitoring is to be used, the ADDR and TSEL items need to be
provided (see QOS option).
type    value  description
ADDR      1    network address of source
TSEL      2    transport address
CNAME     4    canonical user and host identifier,
               e.g., ``doe@sleepy.megacorp.com'' or
               ``sleepy.megacorp.com''
MAIL      5    user's electronic mail address,
               e.g., ``John.Doe@megacorp.com''
LOC       8    geographic user location,
               e.g., ``Rm. 2A244, Berkeley Heights, NJ''
TXT      16    text describing the source,
               e.g., ``John Doe, Bit Recycler, Megacorp''
Items are padded with the binary value zero to the next multiple of
four octets. Each item may appear only once unless otherwise noted.
A more detailed description of the content of some of these items follows:
ADDR: A source may send several network addresses, but only one
for each address type value. Address types are identified by
the Domain Name Service Resource Record (RR) type, as specified
in the current edition of the Assigned Numbers RFC. For NSAP
addresses, the NSEL byte is not included.
TSEL: The protocol identifier uses the IP Protocol Numbers defined
in the current edition of the Assigned Numbers RFC. The
figure shows the use of the TSEL item for the TCP and UDP
protocols. There must be no more than one TSEL item in an SDES
option. The TSEL item should precede any address information.
[Multiple concurrent transport addresses are not meaningful.
The ordering simplifies processing at the receiver.]
CNAME: The CNAME item must have the format ``user@host'' or
``host'', where ``host'' is the fully qualified domain name of
the host from which the real-time data originates, formatted
according to the rules specified in RFC 1034, RFC 1035 and
Section 2.1 of RFC 1123. The ``host'' form may be used if a
user name is not available, for example on single-user systems.
The user name should be in a form that a program such as
``finger'' or ``talk'' could use, i.e., it typically is the
login name rather than the ``real life'' name. Note that the
host name is not necessarily identical to the electronic mail
address of the participant. The latter is provided through the
MAIL item.
LOC: Depending on the application, different degrees of detail are
appropriate for this item. For conference applications, a
string like ``Tampere, Finland'' may be sufficient, while for
an active badge system, strings like ``Room 2A244, AT&T BL MH''
might be appropriate.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F| SDES | length | source identifier |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| type = ADDR | length | reserved | address type |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| network-layer address ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| type = ADDR | length = 2 | reserved | addr. type = 1|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| IPv4 address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| type = TSEL | length | reserved | transport pro.|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| transport-address (port number) ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
H. Schulzrinne/S. Casner Expires 10/01/93 [Page 20]
INTERNET-DRAFT RTP July 30, 1993
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| type = TSEL | length | reserved | protocol = 6 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| reserved | TCP port number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| type = TSEL | length | reserved | protocol = 17 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| reserved | UDP port number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| type = CNAME | length | user and domain name ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| type = MAIL | length | electronic mail address ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| type = LOC | length | geographic location of site ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| type = TXT | length | text describing source ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
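The item layout shown in the figures can be produced as sketched
below. The sketch assumes that the length field counts the whole
item, including the type and length octets, in units of four octets;
this matches the ``length = 2'' shown for the IPv4 ADDR item. The
function names are illustrative and no terminating octet is added
beyond the zero padding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { SDES_ADDR = 1, SDES_TSEL = 2, SDES_CNAME = 4, SDES_MAIL = 5,
       SDES_LOC = 8, SDES_TXT = 16 };

/*
 * Write a text item (CNAME, MAIL, LOC or TXT) into out.  Returns the
 * number of octets written (a multiple of four), or 0 if it does not fit.
 */
size_t sdes_put_text(unsigned char *out, size_t cap,
                     unsigned int type, const char *text)
{
    size_t n, total;

    assert(out != NULL && text != NULL);
    n = strlen(text);
    total = (2 + n + 3) & ~(size_t)3;      /* round up to four octets */
    if (total > cap || total / 4 > 255)
        return 0;
    out[0] = (unsigned char)type;
    out[1] = (unsigned char)(total / 4);   /* length in 32-bit words */
    memcpy(out + 2, text, n);
    memset(out + 2 + n, 0, total - 2 - n); /* pad with binary zero */
    return total;
}

/* Write an ADDR item carrying an IPv4 address (address type 1). */
size_t sdes_put_ipv4(unsigned char *out, size_t cap,
                     const unsigned char addr[4])
{
    assert(out != NULL && addr != NULL);
    if (cap < 8)
        return 0;
    out[0] = SDES_ADDR;
    out[1] = 2;                /* two 32-bit words */
    out[2] = 0;                /* reserved */
    out[3] = 1;                /* address type: DNS RR type A */
    memcpy(out + 4, addr, 4);
    return 8;
}
```

For example, the 23-character CNAME ``doe@sleepy.megacorp.com''
occupies 28 octets, giving a length field of 7.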
BYE 35 The BYE option indicates that a particular site is no longer
active. A bridge sends BYE options with a (non-zero) content source
value. An identifier value of zero indicates that the source
indicated by the synchronization source (SSRC) option and network
address is no longer active. If a bridge shuts down, it should
first send BYE options for all content sources it handles, followed
by a BYE option with an identifier value of zero. Each RTCP message
can contain one or more BYE messages. [Multiple identifiers in a
single BYE option are not allowed to avoid ambiguities between the
special value of zero and any necessary padding.]
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|     BYE     |  length = 1   |   content source identifier   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
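The shutdown sequence of a bridge described above might be generated
as in the following sketch. It assumes that the F bit marks the last
option in the packet and that the identifier is carried with its
high-order octet first; bye_put and bye_bridge_shutdown are
illustrative names.

```c
#include <assert.h>
#include <stddef.h>

#define RTCP_BYE 35

/* Write one BYE option; id 0 refers to the synchronization source. */
size_t bye_put(unsigned char *out, size_t cap, unsigned int id, int final)
{
    assert(out != NULL && id <= 0xFFFF);
    if (cap < 4)
        return 0;
    out[0] = (unsigned char)((final ? 0x80 : 0x00) | RTCP_BYE);
    out[1] = 1;                          /* length: one 32-bit word */
    out[2] = (unsigned char)(id >> 8);   /* high-order octet first */
    out[3] = (unsigned char)(id & 0xFF);
    return 4;
}

/* One BYE per content source, followed by a BYE with identifier zero. */
size_t bye_bridge_shutdown(unsigned char *out, size_t cap,
                           const unsigned int *ids, size_t n)
{
    size_t i, used = 0;

    if (cap < 4 * (n + 1))
        return 0;
    for (i = 0; i < n; i++)
        used += bye_put(out + used, cap - used, ids[i], 0);
    used += bye_put(out + used, cap - used, 0, 1);
    return used;
}
```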
QOS 36 Quality of service measurement. The QOS option describes
statistics of a single synchronization source. The synchronization
source is identified by one of the ADDR items from the SDES option
together with the TSEL item from the SDES option. The SDES items
are appended directly to the fixed-length part of the QOS option,
with TSEL following ADDR. For a description of these items, see the
SDES option.
The other fields of the option contain the number of packets
expected (32 bits), the number of packets received (32 bits), the
minimum delay, the maximum delay and the average delay. The delay
measures are encoded as 16/16 NTP timestamps, that is, 16 bits
encode the number of seconds and 16 bits the fraction of a second.
A single RTCP packet may contain several QOS options. It is left
to the implementor to decide how often to transmit QOS options
and which sources are to be included. [The timestamp format
is identical to the one used in the fixed RTP header. The
quality-of-service information is identical to that carried in the
reverse control option.]
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|     QOS     |    length     |    synchronization source     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       packets expected                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       packets received                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    minimum delay (seconds)    |   minimum delay (fraction)    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    maximum delay (seconds)    |   maximum delay (fraction)    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    average delay (seconds)    |   average delay (fraction)    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  type = ADDR  |    length     |   reserved    | address type  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    network-layer address                    ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  type = TSEL  |    length     |   reserved    |transport pro. |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              transport-address (port number)                ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
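The 16/16 delay encoding can be illustrated with the following
conversion routines. The loss fraction is not a field of the option;
it is shown only as one quantity a receiver of the option could
derive from the two packet counters. All function names are
illustrative.

```c
#include <assert.h>

typedef unsigned long u_long;

/* Encode a delay in seconds as 16 bits of seconds, 16 bits of fraction. */
u_long delay_to_fixed(double seconds)
{
    double v;

    assert(seconds >= 0.0 && seconds < 65536.0);
    v = seconds * 65536.0 + 0.5;          /* round to nearest 2**-16 s */
    if (v >= 4294967296.0)
        v = 4294967295.0;                 /* clamp at the largest value */
    return (u_long)v;
}

/* Decode a 16/16 value back into seconds. */
double fixed_to_delay(u_long fixed)
{
    return (double)(fixed & 0xFFFFFFFFUL) / 65536.0;
}

/* Fraction of packets lost, derived from the two counters. */
double loss_fraction(u_long expected, u_long received)
{
    if (expected == 0 || received >= expected)
        return 0.0;
    return (double)(expected - received) / (double)expected;
}
```

A delay of 1.5 seconds is thus carried as 0x0001 in the seconds field
and 0x8000 in the fraction field.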
5 Security Considerations
IP multicast provides no direct means for a sender to know all the receivers
of the data sent. RTP options make it easy for all participants in a
conference to identify themselves; if deemed important for a particular
application, it is the responsibility of the application writer to make
listening without identification difficult. It should be noted, however,
that within an internet, privacy of the payload can generally only be
assured by encryption.
The periodic transmission of session messages may make it possible to detect
denial-of-service attacks. For many types of payload expected to be carried
in RTP packets, such as compressed audio and video, the data is very close
to white noise, making statistics-based ciphertext-only attacks difficult.
Without MICS/MICA options, it may even be difficult to detect automatically
when the code has been broken. However, the session information is
more or less constant and predictable, allowing known-plaintext attacks.
Chosen-plaintext attacks appear to be difficult.
Since the timestamp in the RTP header is protected by the message integrity
check options, some replay attacks can be detected if the receiver can bound
the maximum packet delay and clock offset of the sender.
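A receiver could apply such a bound as sketched below, operating on
timestamps that have already been extended to full local time in
seconds. The function name and the way the two bounds are combined
into an acceptance window are assumptions made for illustration.

```c
#include <assert.h>

/*
 * Returns 1 if a packet stamped ts and received at local time now lies
 * within the acceptance window, 0 if it is too old or too far ahead.
 * max_delay bounds the packet delay, max_offset the clock offset.
 */
int timestamp_plausible(double ts, double now,
                        double max_delay, double max_offset)
{
    double age = now - ts;

    assert(max_delay >= 0.0 && max_offset >= 0.0);
    return age <= max_delay + max_offset && age >= -max_offset;
}
```

Packets replayed within the window are not caught by this test, which
is why only some replay attacks can be detected in this way.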
Without authentication, the SDES fields may be used to impersonate another
site. Impersonation and denial-of-service attacks can be made more
difficult by providing digital signatures for all or parts of a message.
The MICA or MICS and ENC RTP options described in Section 3 support
privacy within group communications. The issues of key distribution and
a certification hierarchy are outside the scope of this document. A
direct mapping of all PEM header fields into RTCP option types would
be straightforward and would allow reuse of existing PEM implementations.
However, it is questionable whether loose conference control is the
appropriate mechanism for distributing key and certificate information.
6 RTP over network and transport protocols
This section describes issues specific to carrying RPDUs over particular
network and transport protocols.
6.1 Defaults
The following rules apply unless superseded by protocol-specific subsections
in this section.
If RTP protocol data units (RPDU) are carried over underlying protocols that
provide the abstraction of a continuous bit stream rather than messages,
each RPDU is prefixed by a 32-bit framing field containing the length of
the RPDU measured in octets, not including the framing field itself. If an
RPDU traverses a path over a mixture of octet-stream and message-oriented
protocols, each RTP-level bridge between these protocols is responsible for
adding and removing the framing field. A profile may determine that framing
is to be used for protocols that do provide framing in order to allow
carrying several RPDUs in one underlying protocol data unit. [Carrying
several RPDUs in one network or transport packet reduces header overhead and
may ease synchronization between different streams.]
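The framing rule can be sketched as follows. The sketch assumes that
the 32-bit length is transmitted in network byte order (high-order
octet first); the function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Prefix an RPDU with its 32-bit length; returns octets written or 0. */
size_t frame_put(unsigned char *out, size_t cap,
                 const unsigned char *rpdu, size_t len)
{
    assert(out != NULL && rpdu != NULL);
    if (cap < 4 || cap - 4 < len)
        return 0;
    out[0] = (unsigned char)((len >> 24) & 0xFF);
    out[1] = (unsigned char)((len >> 16) & 0xFF);
    out[2] = (unsigned char)((len >> 8) & 0xFF);
    out[3] = (unsigned char)(len & 0xFF);
    memcpy(out + 4, rpdu, len);          /* length excludes the prefix */
    return 4 + len;
}

/*
 * Try to extract one RPDU from a stream buffer.  Returns the number of
 * octets consumed (4 + length), or 0 if the frame is not yet complete.
 */
size_t frame_get(const unsigned char *in, size_t avail,
                 const unsigned char **rpdu, size_t *len)
{
    unsigned long n;

    assert(in != NULL && rpdu != NULL && len != NULL);
    if (avail < 4)
        return 0;
    n = ((unsigned long)in[0] << 24) | ((unsigned long)in[1] << 16) |
        ((unsigned long)in[2] << 8) | (unsigned long)in[3];
    if (avail - 4 < n)
        return 0;
    *rpdu = in + 4;
    *len = (size_t)n;
    return 4 + (size_t)n;
}
```

A bridge between a stream and a message-oriented protocol would call
frame_get on the stream side and forward each extracted RPDU without
the prefix.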
6.2 ST-II
The next protocol field (``NextPCol'', Section 4.2.2.10 in RFC 1190) is
used to distinguish two encapsulations of RTP over ST-II. The first uses
NextPCol value TBD and directly places the RPDU into the ST-II data area.
The second uses NextPCol value TBD; the RTP header is then preceded by the 32-bit header
shown below. The byte count determines the number of bytes in the RTP
header and payload to be checksummed. The 16-bit checksum uses the TCP and
UDP checksum algorithm.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| count of bytes to be checked  |           check sum           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                       ... RTP header ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
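The checksum computation used by TCP and UDP is the 16-bit one's
complement of the one's complement sum of the data, taken as 16-bit
words. A minimal sketch follows; it computes the checksum over a
given octet range only. Whether the checksum field itself or any
further header fields are included in that range is not settled by
the text above and is left to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* 16-bit one's complement checksum over data[0..count-1]. */
unsigned int internet_checksum(const unsigned char *data, size_t count)
{
    unsigned long sum = 0;
    size_t i;

    assert(data != NULL || count == 0);
    for (i = 0; i + 1 < count; i += 2)
        sum += ((unsigned long)data[i] << 8) | data[i + 1];
    if (count & 1)
        sum += (unsigned long)data[count - 1] << 8;  /* pad odd octet */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);          /* fold carries */
    return (unsigned int)(~sum & 0xFFFF);
}
```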
A Implementation Notes
We describe aspects of the receiver implementation in this section. There
may be other implementation methods that are faster in particular operating
environments or have other advantages. These implementation notes are for
informational purposes only.
A.1 Timestamp recovery
A fully specified NTP timestamp with 32 bits of full seconds and 16 bits of
resolution for the fractional seconds can be easily recovered from the RTP
timestamp. The following code stores timestamps as the 48-bit whole part of
a double precision floating point number:
#include <math.h>

typedef double CLOCK_t;
typedef unsigned long u_long;

#define MAX_32bit 4294967296.   /* 2**32 */
#define MAX31     0x7fffffff    /* 2**31 - 1 */

/* t:   in: timestamp, low-order 32 bits */
/* now: in: current local time */
CLOCK_t extend_timestamp(u_long t, double now)
{
  u_long high, low;  /* high and low order bits of 48-bit clock */

  low  = fmod(now, MAX_32bit);
  high = now / MAX_32bit;
  /* adjust high if t and now lie on opposite sides of a 32-bit wrap */
  if ((low > t) && (low - t > MAX31)) high++;
  else if ((low < t) && (t - low > MAX31)) high--;
  return high * MAX_32bit + t;
} /* extend_timestamp */
Using the full timestamp internally has the advantage that the remainder of
the receiver code does not have to be concerned with modulo arithmetic. The
current local time does not have to be derived directly from the system
clock for every packet; a clock based on samples, e.g., incremented by the
nominal audio frame duration, is sufficient.
A.2 Detecting the Beginning of a Synchronization Unit
RTP packets contain a bit flag indicating the end of a synchronization unit.
The following code fragment determines if a packet is the beginning of a
synchronization unit:
CLOCK_t eos_t, t, now;
int flag;
struct {
  unsigned int ver:2;     /* version number */
  unsigned int flow:6;    /* flow */
  unsigned int o:1;       /* option present */
  unsigned int s:1;       /* sync bit */
  unsigned int format:6;  /* content type */
  u_short seq;            /* sequence number */
  u_long ts;              /* time stamp */
} *h;

t = extend_timestamp(h->ts, now);
if (h->s) {                    /* end of a synchronization unit */
  flag = 1;
  eos_t = t;
}
else if (flag && t > eos_t) {  /* first packet after that end */
  flag = 0;
  /* handle beginning of synchronization unit */
}
(The structure definition has to be changed for little endian systems.)
A.3 Demultiplexing and Locating the Synchronization Source
The combination of multicast or destination unicast address, destination
port and flow ID determines the channel. For each channel, the receiver
maintains a list of all sources, content and synchronization sources alike,
in a table or other suitable data structure. Synchronization sources are
stored with a content source value of zero. When an RTP packet arrives, the
receiver determines its network source address and port (from information
returned by the operating system), synchronization source (SSRC option) and
content source(s) (CSRC option). To locate the table entry containing
timing information, mapping from content descriptor to actual encoding,
etc., the receiver sets the content source to zero and locates a table
entry based on the triple (network address and port, synchronization source
identifier, 0).
The receiver identifies the contributors to the packet (for example, the
speaker who is heard in the packet) through the list of content sources
carried in the CSRC option. To locate the table entry, it matches on the
triple (network address and port, synchronization source identifier, content
source).
Note that since network addresses are only generated locally at the
receiver, the receiver can choose whatever format seems most appropriate for
matching. For example, a Berkeley Unix-based system may use struct sockaddr
data types if it expects network sources with non-IP addresses.
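The table lookup described in this section can be sketched as
follows. A linear search is used for brevity; a hash table keyed on
the same triple would serve equally well. The structure layout, the
16-octet address buffer and all names are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ADDR_MAX 16

struct source_key {
    unsigned char addr[ADDR_MAX]; /* network address and port, local form */
    size_t addr_len;
    unsigned int ssrc;            /* synchronization source identifier */
    unsigned int csrc;            /* content source, 0 for the sync source */
};

struct source_entry {
    struct source_key key;
    int in_use;
    /* timing information, format mappings, SDES data, ... */
};

int key_equal(const struct source_key *a, const struct source_key *b)
{
    return a->addr_len == b->addr_len && a->ssrc == b->ssrc &&
           a->csrc == b->csrc &&
           memcmp(a->addr, b->addr, a->addr_len) == 0;
}

/* Find the entry matching the full triple, or NULL. */
struct source_entry *source_find(struct source_entry *tab, size_t n,
                                 const struct source_key *key)
{
    size_t i;

    assert(tab != NULL && key != NULL && key->addr_len <= ADDR_MAX);
    for (i = 0; i < n; i++)
        if (tab[i].in_use && key_equal(&tab[i].key, key))
            return &tab[i];
    return NULL;
}

/* Find the entry holding timing state: same triple, content source 0. */
struct source_entry *sync_source_find(struct source_entry *tab, size_t n,
                                      const struct source_key *key)
{
    struct source_key k = *key;

    k.csrc = 0;
    return source_find(tab, n, &k);
}
```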
Acknowledgments
This draft is based on discussion within the IETF audio-video transport
working group chaired by Stephen Casner. The current protocol has its
origins in the Network Voice Protocol and the Packet Video Protocol (Danny
Cohen and Randy Cole) and the protocol implemented by the 'vat' application
(Van Jacobson and Steve McCanne). Stuart Stubblebine (ISI) helped with
the security aspects of RTP. Ron Frederick (Xerox PARC) provided extensive
editorial assistance.
B Addresses of Authors
Stephen Casner
USC/Information Sciences Institute
4676 Admiralty Way
Marina del Rey, CA 90292-6695
telephone: +1 310 822 1511 (extension 153)
electronic mail: casner@isi.edu
Henning Schulzrinne
AT&T Bell Laboratories
MH 2A244
600 Mountain Avenue
Murray Hill, NJ 07974
telephone: +1 908 582 2262
electronic mail: hgs@research.att.com
References
[1] J. Postel, ``Internet protocol,'' Network Working Group Request for
Comments RFC 791, Information Sciences Institute, Sept. 1981.
[2] International Standards Organization, ``ISO/IEC DIS 10646-1:1993
information technology -- universal multiple-octet coded character set
(UCS) -- part I: Architecture and basic multilingual plane,'' 1993.
[3] The Unicode Consortium, The Unicode Standard. New York, New York:
Addison-Wesley, 1991.
[4] D. L. Mills, ``Network time protocol (version 3) -- specification,
implementation and analysis,'' Network Working Group Request for
Comments RFC 1305, University of Delaware, Mar. 1992.
[5] D. Balenson, ``Privacy enhancement for internet electronic mail: Part
III: Algorithms, modes, and identifiers,'' Network Working Group
Request for Comments RFC 1423, IETF, Feb. 1993.
[6] V. L. Voydock and S. T. Kent, ``Security mechanisms in high-level
network protocols,'' ACM Computing Surveys, vol. 15, pp. 135--171,
June 1983.
[7] B. S. Kaliski, Jr., ``The MD2 message-digest algorithm,'' Network
Working Group Request for Comments RFC 1319, RSA Laboratories, Apr.
1992.
[8] R. Rivest, ``The MD5 message-digest algorithm,'' Network Working Group
Request for Comments RFC 1321, IETF, Apr. 1992.
[9] P. Mockapetris, ``Domain names -- concepts and facilities,'' Network
Working Group Request for Comments RFC 1034, ISI, Nov. 1987.
[10] P. Mockapetris, ``Domain names -- implementation and specification,''
Network Working Group Request for Comments RFC 1035, ISI, Nov. 1987.